Estimation of Subcellular Proteomes in Bacterial Species

Authors

  • Brian R. King
  • Lance Latham
  • Chittibabu Guda
Abstract

Computational methods for predicting the subcellular localization of bacterial proteins play a crucial role in ongoing efforts to annotate the function of these proteins and to suggest potential drug targets. Used in combination with other experimental and computational methods, they can contribute to biomedical research by annotating the proteomes of a wide variety of bacterial species. We use the ngLOC method, a Bayesian classifier that predicts the subcellular localization of a protein based on the distribution of n-grams in a curated dataset of experimentally determined proteins. Subcellular localization was predicted with an overall accuracy of 89.7% for Gram-negative and 89.3% for Gram-positive bacterial protein sequences. By applying a confidence score threshold, we improve the precision to 96.6% while covering 84.4% of the Gram-negative data, and to 96.0% while covering 87.9% of the Gram-positive data. We use this method to estimate the subcellular proteomes of ten Gram-negative species and five Gram-positive species, covering an average of 74.7% of the proteome for Gram-negative species and 80.6% for Gram-positive species. The method is useful for large-scale analysis and annotation of the subcellular proteomes of bacterial species. We demonstrate that it has excellent predictive performance while achieving superior proteome coverage compared to other popular methods such as PSORTb and PLoc.
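
The general idea of a Bayesian classifier over n-grams with a confidence threshold can be illustrated with a minimal sketch. This is not the ngLOC implementation: the class name `NGramNaiveBayes`, the add-alpha smoothing, the use of the top posterior as the confidence score, the toy sequences, and the two location labels are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def ngrams(seq, n):
    """Return all overlapping n-grams of a sequence string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramNaiveBayes:
    """Toy multinomial naive Bayes over n-grams (illustrative, not ngLOC)."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n
        self.alpha = alpha                  # assumed add-alpha smoothing
        self.counts = defaultdict(Counter)  # class -> n-gram counts
        self.class_totals = Counter()       # class -> training sequences
        self.vocab = set()

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.class_totals[label] += 1
            self.vocab.update(grams)
        return self

    def posteriors(self, seq):
        """Return a dict mapping each class to its posterior probability."""
        n_train = sum(self.class_totals.values())
        v = len(self.vocab) + 1  # +1 reserves mass for unseen n-grams
        log_scores = {}
        for label, gram_counts in self.counts.items():
            total = sum(gram_counts.values())
            score = math.log(self.class_totals[label] / n_train)
            for g in ngrams(seq, self.n):
                score += math.log(
                    (gram_counts[g] + self.alpha) / (total + self.alpha * v)
                )
            log_scores[label] = score
        # Normalise in log space to avoid numerical underflow.
        m = max(log_scores.values())
        exp = {k: math.exp(s - m) for k, s in log_scores.items()}
        z = sum(exp.values())
        return {k: e / z for k, e in exp.items()}

    def predict(self, seq, threshold=0.0):
        """Return (label, confidence); label is None below the threshold."""
        post = self.posteriors(seq)
        label = max(post, key=post.get)
        conf = post[label]
        return (label if conf >= threshold else None), conf


# Hypothetical toy data; real training would use a curated dataset.
train_seqs = ["MKKLLAAALL", "MKLLAALLAA", "GDSGDEEDSG", "DEGSDDEGSE"]
train_labels = ["membrane", "membrane", "cytoplasm", "cytoplasm"]
model = NGramNaiveBayes(n=2).fit(train_seqs, train_labels)

label, conf = model.predict("MKLLAAL", threshold=0.9)
print(label, round(conf, 3))
```

Raising `threshold` leaves more sequences unlabelled (lower coverage) in exchange for fewer wrong labels among those that are kept (higher precision), which is the trade-off the abstract reports.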


Related resources

PSORTdb—an expanded, auto-updated, user-friendly protein subcellular localization database for Bacteria and Archaea

The subcellular localization (SCL) of a microbial protein provides clues about its function, its suitability as a drug, vaccine or diagnostic target and aids experimental design. The first version of PSORTdb provided a valuable resource comprising a data set of proteins of known SCL (ePSORTdb) as well as pre-computed SCL predictions for proteomes derived from complete bacterial genomes (cPSORTd...


MetazSecKB: the human and animal secretome and subcellular proteome knowledgebase

The subcellular location of a protein is a key factor in determining the molecular function of the protein in an organism. MetazSecKB is a secretome and subcellular proteome knowledgebase specifically designed for metazoan, i.e. human and animals. The protein sequence data, consisting of over 4 million entries with 121 species having a complete proteome, were retrieved from UniProtKB. Protein s...


Subcellular proteomics—where cell biology meets protein chemistry

The development of compartments in eukaryotic cells and the distribution of nuclear-encoded proteins underlies the expansion of plant genomes, the proliferation of multigene families and the specialization of cellular functions. The exploration of the proteome of the cell in terms of the collection of its subcompartments is therefore both a practical approach and also a function led necessity t...


Comprehensive subcellular topologies of polypeptides in Streptomyces

BACKGROUND Members of the genus Streptomyces are Gram-positive bacteria that are used as important cell factories to produce secondary metabolites and secrete heterologous proteins. They possess some of the largest bacterial genomes and thus proteomes. Understanding their complex proteomes and metabolic regulation will improve any genetic engineering approach. RESULTS Here, we performed a com...


PSORTdb: expanding the bacteria and archaea protein subcellular localization database to better reflect diversity in cell envelope structures

Protein subcellular localization (SCL) is important for understanding protein function, genome annotation, and has practical applications such as identification of potential vaccine components or diagnostic/drug targets. PSORTdb (http://db.psort.org) comprises manually curated SCLs for proteins which have been experimentally verified (ePSORTdb), as well as pre-computed SCL predictions for deduc...




Publication date: 2009